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Abstract. The authors propose a recycling MINRES scheme for a solution of subsequent 
self-adjoint linear systems as appearing, for example, in the Newton process for solving nonlinear 
equations. Ritz vectors are automatically extracted from one MINRES run and then used for self- 
adjoint deflation in the next. The method is designed to work with a preconditioner and arbitrary 
inner products. Numerical experiments with nonlinear Schrodinger equations indicate a substantial 
decrease in computation time when recycling is used. 
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1. Introduction. Sequences of linear algebraic systems frequently occur in the 
numerical solution process of various kinds of problems. Most notable are implicit 
time stepping schemes and Newton's method for solving nonlinear equation systems. 
It is often the case that the operators in subsequent linear systems have similar spec- 
tral properties or are in fact equal. To exploit this, a common approach is to factorize 
the operator once and apply the factorization to the following steps. However, this 
strategy typically has high memory requirements and is thus hardly applicable to 
problems with many unknowns. Also, it is not applicable if subsequent linear opera- 
tors are even only slightly different from each other. 

The authors use the idea of an alternative approach that carries over spectral 
information from one linear system to the next by extracting approximations of eigen- 
vectors and using them in a deflation framework |TSl El EH [19] . For a more detailed 
overview on the background of such methods, see [3]. The method is designed for 
Krylov subspace methods in general and is worked out in this paper for the MIN- 
RES method [M] in particular. This is motivated by the fact that all observables of 
quantum-dynamical systems are self-adjoint. 

The idea of recycling spectral information in Krylov subspace methods is not new. 
Notably, Wang, de Sturler, and Paulino proposed the RMINRES method that 
includes the extraction of approximate eigenvectors, i.e., Ritz vectors and a modifica- 
tion of the MINRES method that includes these vectors in the search space for the 
following linear systems (augmentation) . The RMINRES method with augmentation 
of the search space is mathematically equivalent to the standard MINRES method 
applied to a projected linear system [5J. Krylov subspace methods that are applied 
to projected linear systems are often called deflated methods. In literature, both aug- 
mented and deflated methods have been used in a variety of settings; we refer to the 
review article by Simoncini and Szyld [31j for a comprehensive overview. 

The method proposed here differs from RMINRES in that it preserves self- 
adjointness and allows for arbitrary preconditioners and inner products (see sec- 
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tion 2.2 1. As an added benefit, the approach can also incorporate spectral information 
that is explicitly provided. This is of particular interest for problems which are known 
to have a small number of eigenvalues that harm the the Krylov convergence, e.g., 
eigenvalues of large or small magnitude. An important field of application here are 
nonlinear problems that bear some kind of continuous symmetry, e.g., gauge invari- 
ance [17j . Even if distinctly definite, linearizations of these problems around any 
nontrivial solution typically have a nontrivial null space, i.e., one or more eigenvalues 
which are exactly 0. 

As a landmark example for this class of problems, we will consider nonlinear 
Schrodinger equations. Nonlinear Schrodinger equations and their variations are used 
to describe a wide variety of physical systems, most notably in superconductivity, 
quantum condensates, nonlinear acoustics |34| . nonlinear optics |10j . and hydrody- 
namics [55]. Although the existence of steady state solutions is proven in certain 
cases, analytic expressions for realistic systems are often impossible to obtain. Hence, 
numerical solution methods are of particular importance for gaining insight in the 
physics of the system. For the solution of nonlinear Schrodinger equations with New- 
ton's method, a linear system has to be solved with the Jacobian operator for each 
Newton update. The Jacobian operator is self-adjoint with respect to a nonstandard 
inner product which renders most default approaches unusable. In general, the Jaco- 
bian operator is indefinite and thus the MINRES method with the appropriate inner 
product is a suitable choice for solving the occurring linear systems. In order to be 
applicable in practice, the MINRES method has to be combined with a preconditioner 
that is able to limit the number of MINRES iterations to a feasible extent [29 . Due 
to the special structure of the nonlinear Schrodinger equation, the Jacobian operator 
exhibits one eigenvalue that moves to zero when the Newton iterate converges to a 
nontrivial solution and is exactly zero in a solution. Because this situation only occurs 
in the last step, no linear system has to be solved with an exactly singular Jacobian 
operator but the severely ill-conditioned Jacobian operators in the final Newton steps 
lead to convergence slowdown or stagnation in the MINRES method even when a 
preconditioner is applied. One important instance of nonlinear Schrodinger equations 
is the Ginzburg-Landau equation that models phenomena of certain superconduc- 
tors. We propose a recycling MINRES method and show how it can help to improve 
the convergence of the MINRES method and thus the overall time consumption of 
Newton's method for the Ginzburg-Landau equation. 

The paper is organized as follows: section [2] gives a brief overview on the precon- 
ditioned MINRES method for an arbitrary nonsingular linear operator that is self- 
adjoint with respect to an arbitrary inner product. The deflated MINRES method is 
described in subsection 1 2 . 2 1 while subsection |2.3| presents the computation of Ritz vec- 
tors and explains their use in the overall algorithm for the solution of a sequence of self- 
adjoint linear systems. In section^ this algorithm is applied to the Ginzburg-Landau 



equation. Subsections 3.1 and |3.2| deal with the numerical treatment of nonlinear 
Schrodinger equations in general and the Ginzburg-Landau equation in particular. 
In subsection |3.3| numerical results for typical two- and three-dimensional setups are 
presented. 

2. MINRES. 

2.1. Preconditioned MINRES with arbitrary inner product. This sec- 
tion presents well-known properties of the preconditioned MINRES method. As op- 
posed to ordinary textbook presentations this section incorporates a general Hilbert 
space. For K S {]R,C} let H be a K-Hilbert space with inner product (-,-)h an< ^ 
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induced norm \\-\\ H - Throughout this paper the inner product (•, ■) H is linear in the 
first and anti-linear in the second argument and we define L(H) := {£ : H — > 
H | C is linear and bounded}. We wish to obtain x £ H from 

Ax = b (2.1) 

where A £ L(H) is (•, -^-self-adjoint and invertible and b £ H. The self-adjointness 
implies that the spectrum o~(A) is real. However, we do not assume that A is definite. 
If an initial guess x £ H is given, we can approximate x by iterates 

x n = x + y n with y n £ K n {A, r ) (2.2) 

where r = b — Ax is the initial residual and K, n (A, ro) = span{r , Ar , . . . , A n ~ 1 ro} 
is the nth Krylov subspace generated with A and ro- We concentrate on minimal 



residual methods here, i.e., methods that construct iterates of the form (2.2 1 such 
that the residual r n := b — Ax n has minimal ||-||^-norm, that is 

\\ r n\\ H = \\b - Ax n \\ H = \\b - A(x + y n )\\ H = \\r - Ay n \\ H 

= mm \\r -Ay\\ H = mm \\p(A)r \\ H (2.3) 

y£K.n(A,r ) pSil'^ 

where 11° is the set of polynomials of degree at most n with p(0) — 1. For a general 
invertible linear operator A, the minimization problem in ( |2.3[ ) can be solved by 
the GMRES method [53] which is equivalent to the MINRES method [H] if A is self- 
adjoint because then the Lanczos algorithm can be used to obtain a (•, -) ff -orthonormal 
basis of the Krylov subspace K. n (A, r ). 

To facilitate subsequent definitions and statements for general Hilbert spaces, we 
use a block notation for inner products that generalizes the common block notation 
for matrices: 

Definition 2.1. For k, I £ N and two tuples of vectors X = [xi,. . . ,x k ] £ H k 
and Y = [y%, . .. ,yi] £ H l the block inner product (-,-) H ■ H k x H l — > K fe: ' is defined 
by 

(X,Y) H := [(a;;, i=:Li ..., fe . 

3=1,...,/ 

A block X £ H k can be right-multiplied with a matrix just as in the plain matrix 
case: 

Definition 2.2. For X £ H k and Z = [2ij]i=i k £ K. k ' 1 , right multiplication is 

j=i,..'.,i 

defined by 



XZ := 



' fc 

5^ Z i j X i 



£ H l . 



J J 1 1 



Because the MINRES method and the underlying Lanczos algorithm is often 
stated for Hermitian matrices only (i.e., for the Euclidean inner product), we recall 
very briefly some properties of the Lanczos algorithm for a linear operator that is self- 
adjoint with respect to an arbitrary inner product (-,-) H . If the Lanczos algorithm 
with inner product (•,•)# applied to A and the initial vector v\ = ro/ ||^o|| jj has 
completed the nth iteration, the Lanczos relation 



AV n = V n+1 % 



(2.4) 



4 



holds, where the elements of V n +i = ■ ■ ■ , v n +i] £ H n+1 form a (•, -) H -orthonormal 
basis of JC n+1 (A,r ), i.e., span{u x , . . . , u n+ i} = IC n+1 (A,r ) and (14+i,K+i) H = 
I n +i. The matrix T n € K n + 1 i™ is real- valued (even if K = C), symmetric, and 
tridiagonal with 



Ik = [{AVi,Vj) H ]i=i,..., n 



+ 1- 



The nth approximation of the solution of the linear system (2.1 ) generated with the 
MINRES method and the corresponding residual norm, cf. (2.2 1 and (2.3|, can then 
be expressed as 

x n — x a + Vn z n with z n £ K™ and 
\\ r n\\ H = \\ r o -AV n z n \\ H = ||T4 + i(||r || H ei -T n z n )\\ H = ||||r || H ei -T„z„|| 2 . 

By iteratively computing a QR decomposition of T n , the minimization problem in 



(2.3) can be solved without storing the full matrix T n and, more importantly, the full 
Lanczos basis V n . 

If N := dimi? < oo and the elements of U £ H" form a (•, -) H -orthonormal 
eigenvector basis of H, then AU — UD for the diagonal matrix D = diag(Ai, . . . , Ajv) 
with A's eigenvalues Ai, . . . , Ajv £ K on the diagonal. Let £ K N be the represen- 
tation of ro in the basis U, i.e., r$ = Ur^ . According to (2.3), the residual norm of 



the nth approximation obtained with MINRES can be expressed as 

l|r " llff = S W p{A)Ur °\\H = ™J| \\ U PP>) r ah = p ™ n o \\pW r o\\ 2 

<||r y || 2 min||p(D)|| 2 . (2.5) 

From \\r%\\ 2 = H^Vo]^ = \\r \\ H and ||p(D)|| 2 = m£K ie{lr >>)Jv j \p{Xi)\, we obtain the 
well-known MINRES worst-case bound for the relative residual norm [T2 ITS] 

< min max IpfA,-)]. (2.6) 

This can be estimated even further upon letting the eigenvalues of A be sorted 
such that Ai < . . . < A s < < A s+ i < . . . < Ajv for a s £ Nq. By replacing the 



discrete set of eigenvalues in (2.6 1 by the union of the two intervals / := [Ai, A s ] and 
I + := [A s+ i,Aiv], one gets 



r 



",, H < min max |p(Aj)| < min max \p(X)\ 
\\ro\\ H pen° ie{i,...,JV} P gn; Ae/-u/+ 

< 2 / ^a7a^-^a~a^ \ ln/2] , 2 ?) 

" \^v1^M + VI^+il/ 

where [n/2] is the integer part of n/2, cf. [T5J[T5]. Note that this estimate does not 
take into account the actual distribution of the eigenvalues in the intervals / _ and 
/ + . In practice a better convergence behavior than the one suggested by the estimate 
above can often be observed. 

In most applications, the MINRES method is only feasible when it is applied with 
a preconditioner. Consider the preconditioned system 

M~ 1 Ax = M~ 1 b (2.8) 
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where Ai : H — > H is a (•, -^-self-adjoint, invertible, and positive-definite linear 
operator. Note that Mr 1 A is not (•, -^-self-adjoint but self-adjoint with respect to 
the inner product {-,-) M defined by (x,y) M := (Mx,y) H = (x,My) H . The MINRES 
method is then applied to (2.8 1 with the inner product and thus minimizes 



\\A4 1 6 — M 1 Ax^ M . From an algorithmic point of view it is worthwhile to note 
that only the application of Mr~ x is needed and the application of Ai for the inner 
products can be carried out implicitly; cf. [7J chapter 6]. The convergence bound in 



(2.7) then refers to the eigenvalues of Ai 1 A and the goal of preconditioning is to 



achieve a more favorable spectrum of Ai 1 A with an appropriate Ai . 

2.2. Deflated MINRES. In many applications even with the aid of a precon- 
ditioner the convergence of MINRES is hampered - often due to the presence of one 
or a few eigenvalues close to zero that are isolated from the remaining spectrum. This 
case has recently been studied by Simoncini and Szyld |32j . Their analysis and nu- 
merical experiments show that an isolated simple eigenvalue can cause stagnation of 
the residual norm until a harmonic Ritz value approximates the outlying eigenvalue 
well. 

Two strategies are well-known in the literature to circumvent the stagnation or 
slowdown in the convergence of preconditioned Krylov subspace methods described 
above: augmentation and deflation. In augmented methods the Krylov subspace is 
enlarged by a suitable subspace that contains useful information about the problem. 
In deflation techniques the operator is modified with a suitably chosen projection in 
order to "eliminate" components that hamper convergence (e.g., eigenvalues close to 
the origin) . For an extensive overview of these techniques we refer to the survey article 
by Simoncini and Szyld [31]. Both techniques are closely intertwined and even turn 
out to be equivalent in some cases [5]. Here, we concentrate on deflated methods and 
first give a brief description of the recycling MINRES (RMINRES) method introduced 
by Wang, de Sturler, and Paulino [SS] before presenting a slightly different approach. 

The RMINRES method by Wang, de Sturler, and Paulino [36] is mathematically 
equivalent [8] to the application of the MINRES method to the "deflated" equation 

V x Ax = Vxb (2.9) 

where for a given fc-tuple U £ H k of linearly independent vectors (which constitute 
a basis of the "recycled" space) and C := AU, the linear operator V\ : H — > H 
is defined by V\x :— x — C(C,C) H (C,x) H for an arbitrary inner product. Note 
that, although V\ is a (•, ^-self-adjoint projection, V\A in general is not. How- 
ever, as outlined in [36] section 4] an orthonormal basis of the Krylov subspace can 
still be generated with MINRES' short recurrences and the operator V\A because 



K n {V\A, V\r§) = K, n {ViAVl ,Viro) . Solutions of equation (2.9) are not unique for 



k > and thus x was replaced by x. To obtain an approximation x n of the original 



solution x from the approximation x n generated with MINRES applied to (2.9), an 
additional correction has to be applied: 

x n =V 1 x n + U(C,C)- 1 (C,b) H , 

where V\ : H — > H is defined by V\X :— x — U(C,C) H L (C,Ax) H . As outlined in 
section 5], the RMINRES method may suffer from breakdowns which can be cured 
by using the adapted initial guess x := Vix + C(C, C)^ (U, b) H . 

Let us now turn to a slightly different deflation technique for MINRES which 
we formulate with preconditioning directly. We will use a projection which has been 



() 



developed in the context of the CG method for Hermitian and positive-definite oper- 
ators [STJ |31 [35]. Under a mild assumption, this projection can also be used in the 
indefinite case with the MINRES method. Our goal is to use approximations to eigen- 
vectors corresponding to eigenvalues that hamper convergence in order to modify the 
operator with a projection. Consider the preconditioned equation (2.8 1 and assume 
for a moment that the elements of U = [ui, . . . , Uk] G H k form a (•, -^-orthonormal 
basis consisting of eigenvectors of Mr 1 A, i.e., Mr x AlJ = UD with a diagonal matrix 
D = diag(Ai,...,A fc ) £ R k ' k . Then (U, M~ X AU) M = (U,U) M T> = D is nonsingular 
because we assumed that A is invertible. This motivates the following definition: 

Definition 2.3. Let M,A S L(H) be invertible and (•, •) H -self-adjoint operators 
with positive- definite M. Let U e H k be such that (U, M~ 1 AU) M = (U,AU) H is 
nonsingular. We define the projections Vm , V : H — > H by 



V M x :=x-Mr x AXJ(jJ,M.~ x AU)j^(U,x) 
Vx :^x~AU(U,AU) H 1 (U,x) H . 



M 



(2.10) 



The assumption in definition 2.3 that (U, M 1 AU) M is nonsingular holds if 



and only if range(vW -1 .4£/) n range([/)- L - M = {0} or equivalently if range(_4[7) n 
range([/)^ Lff = {0} which is fulfilled for good-enough approximations of eigenvectors. 
Applying the projection Vm to the preconditioned equation (2.8) yields the deflated 
equation 

VmM^Ax = V M M- 1 b. (2.11) 
The following lemma states some important properties of the operator V mM- 1 A. 
Lemma 2.4. Let the assumptions in definition \ 2. 3\ hold. Then 



h ■ 



1. VmM- 1 =M- l V. 

2. VA — AP* where V* is the adjoint operator of V with respect to 
defined by V*x = x - U(U,AU) H 1 (AU,x) H . 

3. V M M.- 1 A = M.- X VA = Mr x AV* is self-adjoint with respect to (■, ■) M . 
For each initial guess xq € H, the MINRES method with inner product (•, ■) M 
applied to equation (2.11) is well defined at each iteration until it terminates 



4 



with a solution of the system. 

Ifx n is the nth approximation and VM-M~ 1 b — VmM^ 1 Ax„ the correspond- 
ing residual generated by the MINRES method with inner product (•, •) M ap- 



plied to (2.11) with initial guess Xq G H , then the corrected approximation 



fulfills 



x n := V*x n + U(U,AU) H 1 (U 1 b) H 



M~ l b - M- x Ax n = V M M~ x b - VmM^Ax,, 



(2.12) 



(2.13) 



(Note that (2.13) also holds for n = 0.) 
Proof. 1., 2. and the equation in 3. follow from elementary calculations. Because 

(VmM- 1 Ax, y) = (VAx,y) H = (Ax,V*y) H - (x,AV*y) H = (x,VAy) H 



(x,V M M- l Ay) 



M' 
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holds for all x,y € H, the operator TmM~ 1 A is self-adjoint with respect to (•, ■) M . 

4. has been proved in [5J Theorem 4.1] for GMRES with the Euclidean inner 
product which can easily be generalized to arbitrary inner products. GMRES is 
mathematically equivalent to MINRES in our case due to the fact that V mM-~ x A is 
self-adjoint. 

5. follows from 1. and 3. by direct calculation: 

M~ l b - M~ l Ax n = M~ l {b - AU{U, AU)~^{U, b) H ) - M~ l AV*x n 

= Mr Y Vb - V M M~ l Ax n = V M MT X b - V M M~ x Ax n . 



Now that we know that MINRES is well-defined when applied to the deflated 



and preconditioned equation (2.111 we want to investigate the convergence behavior 



in comparison to the original preconditioned equation (2. 

Lemma 2.5. Let the assumptions in definition 2.3 and N := dimH < oo hold. 
If o~(A4~ 1 A) = {Ai,...,Ajv} is the spectrum of the preconditioned operator Mr 1 A 
and for k > the elements of ' U € H k form a basis of the Mr 1 A-invariant subspace 
corresponding to the eigenvalues Ai, . . . , then the following holds: 
1. The spectrum of the deflated operator Vm-M -1 A is 



o-(J> M M- Y A) = {0} U {Afc+i, ■ ■ ■ , Aat}. 



2. For n > let x n be the nth corrected approximation (cf. item^ of lemma 2.4) 
of MINRES applied to (2.11) with inner product (t)^vi an d initial guess xq. 
The residuals r n := Mr^b — Mr x Ax n then fulfill 



\ n \ M - min max IK^»)I- 

\\ r o\\ M pen°ie{k+i,...,N}' 



(2.14) 



Proof. 

1. From the definition of Vm m definition 2.3 we directly obtain V mMt 1 AU = 
and thus know that is an eigenvalue of T j m-M~ 1 A with multiplicity at least 
k. Let the elements of V € H N ~ k be orthonormal and such that A4~ 1 AV = 
VT) 2 with D 2 = diag(A fe+1) ...,Ajv). Then (U,V) M = because M~ 1 A 
is self-adjoint with respect to {-r) M - Thus VmY = V and the statement 
follows from V M Mr x AV = VT> 2 . 

2. Because the residual corresponding to the corrected initial guess is r = 
T-Vt-M -1 ^ — Axq) e range(U) M = range(T^), where V is defined as in 
1., we have r = Vr% for a r% S K N - k . Then with D 2 as in 1. we obtain by 



using the orthonormality of V similar to (2.5) 



r n \\ M = min ^{VmM- 1 A)Vr^ 



\M 



min IIV^D^rJ' 



\M 



< IkolL^i m in max |p(Aj)|. 
- 11 u|l - M pen o ie{k+i,...,N} 1 n 



min||p(D 2 )r y || 2 



Implementational notes. Item[T]of lemma 2.4 states that VmM- 
thus the MINRES method can be applied to the linear system 



M.- X V and 



MT x VAx = M~ x Vb 



(2.15) 
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(b) Computational cost. 



Table 2.1: Storage requirements and computational cost of the projection operators 



V and V* (cf. definition 2.3 and lemma 2.4) 



instead of (2.11). When an approximate solution x n of (2.15) is satisfactory then the 
correction (2.12) has to be applied to obtain an approximate solution of the original 
system (2.1). Note that neither A4 nor its inverse A4 _1 show up in the definition 
of the operator V or its adjoint operator V* which is used in the correction. Thus 
the preconditioner Mr 1 does not have to be applied to additional vectors if deflation 
is used. This can be a major advantage since the application of the preconditioner 
operator Mr 1 is the most expensive part in many applications. 

The deflation operator V as defined in definition 2.3 with U € H k needs to store 



AU should be pre-computed and stored. 
■k,k or ^ g mverse hag t be stored. The 



2k vectors because aside from U also C : 
Furthermore the matrix E := (U,C) H G 
adjoint operator V* needs exactly the same data so no more storage is required. The 
construction of C needs k applications of the operator A but - as stated above - no 
application of the preconditioner operator Mr 1 . Because E is Hermitian k(k + l)/2 
inner products have to be computed. One application of V or V* requires k inner 
products, the solution of a linear system with the Hermitian k-by-k matrix E and k 



vector updates. We gather this information in table 2.1 

Instead of correcting the last approximation x n it is also possible to start with 
the corrected initial guess 



x = V*x + U(U ) AU)- 1 (U,b) H 



(2.16) 



and to use V* as a right "preconditioner" (note that V* is singular in general). The 
difference is mainly of algorithmic nature and will be described very briefly. 

For an invertible linear operator B : H — > H the right preconditioned system 
ABy = b can be solved for y and then the original solution can be obtained from 
x = By. Instead of xq the initial guess yo := B~ 1 xq is used and the initial residual 
ro = b — AByo = b — Axq equals the residual of the unpreconditioned system. Then 
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iterates 



y n = ya + z n with z n e K, n (AB, r ) 



and x n '■= By n — xq + Bz n are constructed such that the residual r n = b — ABy n 
I, — Ax n is minimal in \\-\\ H . If the operator AB is self-adjoint the MINRES method 
can again be used to solve this minimization problem. Note that j/o is n °t needed 
and will never be computed explicitly. The right preconditioning can of course be 
combined with a positive definite preconditioner as described in the introduction of 
section 

We now take a closer look at the case B = V* which differs from the above descrip- 
tion because V* is not invertible in general. However, even if the right preconditioned 
system is not consistent (i.e., b range^T 3 *)) the above strategy can be used to solve 



the original linear system. Let us with xq from equation (2.16) construct iterates 



xa + V*y n with ynelCniM^AV*,^) 



(2.17) 



such that the residual 



M^b- M^Axr, 



(2.18) 



has minimal 



-norm. Inserting (|2.17) and the definition of xq into (|2.18|) yields 
K-niM^AV* 



r n = M~ x Vb- M- x VAy n with y n e K^M^AV*,^) = fC^M^VA,^). The 
minimization problem is thus the same as in the case where MINRES is applied to 



the linear system (2.15) and because both the operators and initial vectors coincide 
the same Lanczos relation holds. Consequently the MINRES method can be applied 
for the right preconditioned system 



M~ 1 AV*y = M~ 1 b, x = V*y 



(2.19) 



with the corrected initial guess xq from equation (2.16). The key issue here is that 



the initial guess is treated as in (2.17). A deflated and preconditioned MINRES 
implementation following these ideas only needs the operator V* and the corrected 
initial guess xq. A correction step at the end then is unnecessary. 

2.3. Ritz vector computation. So far we considered a single linear system and 
assumed that a basis for the construction of the projection used in the deflated system 
is given (e.g. eigenvectors are given). We now turn to a sequence of preconditioned 
linear systems 



M^A {i) x 



(2.20) 



where M^,A^) : L(H) are invertible and self-adjoint with respect to (•;•)#, M-U) 
is positive definite and x^\b^ € H for i £ {1, ... , M}. To improve the readability 
we use subscript indices for operators and superscript indices for elements or tuples 
of the Hilbert space H . Such a sequence may arise from a time dependent problem 
or a nonlinear equation where solutions are approximated using Newton's method 
(cf. section |3j) . We now assume that the operator Ai^^^A^+i) only differs slightly 

from the previous operator M-V^A^y Then it may be worthwhile to extract some 
eigenvector approximations from the Krylov subspace and the deflation subspace used 
in the solution of the ith system in order to accelerate convergence of the next system 
by deflating these extracted approximate eigenvectors. 
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For explaining the strategy in more detail we omit the sequence index for a 
moment and always refer to the ith linear system if not specified otherwise. As- 
sume that we used a tuple U G H k whose elements form a (•, -^-orthonormal basis 



to set up the projection Vm ( c f- definition 2.3) for the ith linear system (2.20). 
We then assume that the deflated and preconditioned MINRES method with inner 
product {-,-)m arL< ^ initial guess xq computed a satisfactory approximate solution 
after n steps. The MINRES method then constructed a basis of the Krylov sub- 
space K. n (VM-^~ 1 A, r ) where the initial residual is r = VmM-~ X ^ — Axq). Due 
to the definition of the projection we know that JC n (T'M-M~ 1 A,r ) ±m range({7) 
and we now wish to compute approximate eigenvectors of Ai~ 1 A in the subspace 
S := K, n (T 'mM- 1 l A,r ) U range(?7). We can then pick some approximate eigenvec- 
tors according to the corresponding approximate eigenvalues and the approximation 
quality in order to construct a projection for the deflation of the (i+l)st linear system. 
Let us recall the definition of Ritz pairs [33J : 

Definition 2.6. Let S C H be a finite dimensional subspace and let B : H — > H 
be a linear operator, (w, (i) G S x C is called a Ritz pair of B with respect to S and 
the inner product (•, •) if 

Bw — fiw J-/.,.) S. 



The following lemma gives insight into how the Ritz pairs of the operator Mr 1 A 
with respect to the Krylov subspace K, n (VM-M~ 1 A,ro) and the deflation subspace 
range(£/) can be obtained from data that is available when the MINRES method 
found a satisfactory approximate solution of the last linear system. 

Lemma 2.7. Let A4,A : H — > H be linear, invertible, and (•,•) H -self-adjoint 
and let M. be positive- definite. Furthermore, let U G H be such that (U,U) M = Ik 
and let Vm be defined as in definition \2.3\ 

Assume that the Lanczos algorithm with inner product (•, ■) M applied to the op- 
erator Vm-Mt 1 A and an initial vector v £ range([/) J - A4 proceeds to the nth iteration 
and the Lanczos relation 



VmM-'AV,, = V n+1 T r 



(2.21) 



holds with V n +i = [v\, 

R" +1 >" where s n € E 
real-valued. Let S := 



, Vn+l 



] G H n +\ (V T 



is nonnegative and T„ € 



T„ 

• • • S n _ 

is tridiagonal, symmetric, and 



I„+i andT n 



w G 



IC n (V M M~ 1 A,v) U range(£7) and w = [V n ,U]w G S fo 



Then {w,fi) G S x K is a Ritz pair of M. 1 A with respect to S and the inner 
product (•, ■) M if and only if 



BE X B H 
B H 



w 



(2.22) 



where B := (V n ,AU) H and E := (U,AU) H . 

Furthermore, the squared \\-\\ M -norm of the residual M~ l Aw — /iw is 



\M- x Aw 



flW 



lL = ( G ^) H 



I„4 
B 1 



B 
F 
E 




E 
Ik 



Gi 



(2.23) 
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where 



B = (V n+1 ,AU) H = 



G = 



B 

(v n+ i,AU) H 
F = (AU,M~ 1 AU) H and 

E !B H I k 

-/A k 



with T, = 



Proof, (w, fi) is a Ritz pair of A4 X A with respect to 5 — range([V r „, [/]) and the 
inner product (•, ■) M if and only if 

Mr 1 Aw — fiw J-m S 
(s, M^Aw - nw) M = Vs G 5 
^ ([K,t/],(M"M-^)[K,t/])^w = 

([K, C/],X-U[K, C^D^ffi = M([V r n, I/], [K, E/])^™ 
([V n ,U],M- 1 A[V n ,U}) M w = f j,w 

where the last equivalence follows from the orthonormality of U and V n and the fact 
that range(f7) -L^i fcn^yP m.M~ x A,v) — range(V^). We decompose the left hand side 
as 



([V n ,U],M- l A[V n ,U]) 



M 



(V n) M- l AV n ) M (V n ,M~ 1 AU) M 
(U,M- 1 AV n ) M (U 1 M- 1 AU) M 



The Lanczos relation (2.21) is equivalent to 

M~ l AV„ = Vn+^ + M^AU^AU^iAU^jj 
from which we can conclude with the (•, -^-orthonormality of [V n +i, U] that 



(2.24) 



(V n ,M~ 1 AV n ) M = {V n ,V n+1 ) M T n + (V n ,M- 1 AU) M {U 1 AU) H 1 {AU ) V n ) H 
= T„ + {V n ,AU) H {U,AU)- l {AU,V n ) H . 

The characterization of Ritz pairs is complete by recognizing that B = (V n , M.~ 1 AU) M = 
(V n ,AU) H = (UM^AVn)^ andE = {U,M~ l AU) = (U,AU) H 



Only the residual norm equation remains to show. Therefore we compute with ( 2.24 1 
Mr 1 Aw - /iw = M~ 1 A[V n , U]w - n[V n , U]w 



[V n+1 ,M~ l AU,U] 



T„ - /4„ 
E-'B H I fc 
-/jl k 



w 



[Vn+uM^AU^Gi 



The squared residual 



\\M 



-norm thus is 



iM^Aw - i*w\\ M = {Gw^^Vn+uM^AU^UVn+uM^AU^j^Gw 
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where ( [V n+l , Mr 1 MI, U] , [V n+l , M~ X AU, U] ) 

with the same techniques as in 1. and 2. □ 
Remark 1. Lemma 



M 



In+1 

B H 





B 
F 
E 




E 



can be shown 



2.7 also holds for the (rare) case that JC n (T'M-M~ 1 A, v) is 
an invariant subspace of Vm-M~~ 1 A which we excluded for readability reasons. The 
Lanczos relation (2.21| in this case is Vm-M~ 1 AV ti = V n T n which does not change 
the result. 

shows how a Lanczos relation for the operator V 'm-M 1 A (that can 



Lemma 



2.7 



be genera ted i mplicitly in the deflated and preconditioned MINRES algorithm, cf. end 
of section 2.2) can be used to obtain Ritz pairs of the "undeflated" operator Mr 1 A. 
An algorithm for the solution of the sequence of linear systems (2.20) as described 
in the beginning of this subsection is given in algorithm [T] Additional to the Ritz 
vectors this algorithm can include auxiliary deflation vectors Y^K 



Algorithm 1 Algorithm for the solution of the sequence of linear systems (2.20) 



Input: For i G {1, . . . , M} we have: 

• M-(i) € L{H) is (•, -^-self-adjoint and positive-definite. > preconditioner 

• A(i) G L(H) is (■, -^-self-adjoint. > operator 

• b®,xf e H - 

• y« G H l w for l {i) G N . 
W = [ ] G H° 
for i = 1 — > M do 



> right hand side and initial guess 
> auxiliary deflation vectors (may be empty) 
> no Ritz vectors available in first step 



U = orthonormal basis of span[TF, Y"W] with respect to (•, •)_ M( . ) 
C = A {l) U, E = (U, C) H , set up V* 



> cf. definition 



2.3 



X 



V*x^ ] + U'E- 1 (U, b®) H 



> corrected initial guess 



x% } , V n+ i, T n) B = MINRES b w , M^, V* , x , e) 

MINRES is applied to MT^A^x^ = MT}b® with inner product 
{■■,-)mu ' r ight preconditioner V* , initial guess xq and tolerance e > 0, 



cf. section E2 Then: 



The approximation x$ fulfills 



The Lanzcos relation M^Aa)V*V r 



M^-M^A (i) x$ 
V n+1 T n holds. 



< e. 



M 



(0 



• i lie Lanzcos rciaiion . v i , •M.U)! 

• B = (V n , C) H is generated as a byproduct of the application of V* 



• , w m , (ii,. . . , fi m , pi, 



, p m = Ritz(C/, V n+ i, T„, B, C, E, M^]) 
Ritz(. . .) computes the Ritz pairs (wj,pj) for j G {1, . . . , m} of JAT^A^ 
with respect to span[J7, V n ] and the inner product 
Then: 



'Mi 



cf. lemma 



2.7 



• Wi, . . . , w rn form a (•, -)^ ( -orthonormal basis of span[J7, V n 

• The residual norms pj — 
turned. 



M7}A 



are also re- 



M 



(i) 



W 



\w, 



for pairwise distinct ii 



,i k G {l,...,m}. 



Pick fc Ritz vectors according to Ritz value and residual norm. 
9: end for 
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Applications of Inner Vector 

A M" 1 M products updates 



Orthogonalization 




k k(k - 


-l)/2 


k{k + 


l)/2 


Setup of V* and initial guess Xq k 




fcffcH 


-5)/2 


2k 


n MINRES iterations n 


n 


n{k 


+ 2) 


n(k - 


1-7) 


Computation of Ritz vectors 


k 


I 


a 


k(k- 


-n) 



Table 2.2: Computational cost for one iteration of algorithm [l| (lines |3j-|SJ) with n 
MINRES iterations and k deflation vectors. The number of computed Ritz vectors 
also is k. Operations that do not depend on the dimension ./V := dimi? are neglected. 



Implementational notes. We now comment on the implementational side of the 
determination and utilization of Ritz pairs while solving a sequence of linear systems 
(cf . algorithm[T]) . The solution of a single linear system with the deflated and precondi- 
tioned MINRES method was discussed in section [272} Although the MINRES method 
is based on short recurrences due to the underlying Lanczos algorithm - and thus only 
needs storage for a few vectors - we still have to store the full Lanczos basis V n +i 
for the determination of Ritz vectors. The Lanczos matrix T n € K n + 1 . rl is no issue 
because it is tridiagonal and symmetric. However, by employing a "good" precondi- 
tioner that limits the number of iterations we are able to satisfy memory constraints 



in many applications (cf. section 3.3). Deflation can then be used to further improve 
convergence by directly addressing parts of the preconditioned operator's spectrum. 
An annotated version of the algorithm can be found in algorithm [TJ Note that the 
orthonormalization in line [3] is redundant in exact arithmetic if only Ritz vectors are 
used and the preconditioner does not change. An overview of the computational cost 
of one iteration of algorithm [TJ is given in table |2.2| 

3. Application to nonlinear Schrodinger problems. Given an open domain 
57 C W l , d £ {2, 3}, nonlinear Schrodinger operators are typically derived from the 
minimization of the Gibbs energy in a corresponding physical system and have the 
form 

S(ip) := ()C + V + g\i>\ 2 )ij inQ 

with X C L 2 (il) being the natural energy space of the problem, Y its dual. If 
the domain is bounded, the energy space X may incorporate boundary conditions 
appropriate to the physical setting. The linear operator JC is assumed to be symmetric 
and positive-semidefinite, V : — > K be a given scalar potential, and g E R a 
given coupling parameter. A state ip : fl — > C is called a solution of the nonlinear 
Schrodinger equation if 

S($)=0. (3.2) 

Generally, one is only interested in nontrivial solutions ip ^ 0. The function ifr is often 
referred to as order parameter and its magnitude \ip\ 2 typically describes a particle 
density or, more general, a probability distribution. Note that, because of 



S(exp{ix}V0 = exP^x}" 5 ^) 



(3.3) 
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one solution ip € X is really just a representative of the physically equivalent solutions 
{exp{ix}"0 : X e K}- 



For the numerical solution of (3.2 1, Newton's method is the method of choice. 



Given a good-enough initial guess ipo, the Newton pro cess generates a sequence of 



iterates ipk which converges towards a solution ip of (3.2 1. In each step k of Newton's 
method, a linear system with the Jacobian J (ipk) of S at ipk needs to be solved. The 
linear operator J(ip) is given by 



Jty) :X^Y, 

J(i/>)<p ■= (AC + V + 2<#| 2 ) y + gtfhp 



Despite the fact that states ip are generally complex-valued, J{ip) is linear only if 
X and Y are defined as vector spaces over the field K with the corresponding inner 
product 

<V>K : =»<V>L»(£1)- (3-5) 

This matches the notion that the specific complex argument of the order parameter 
is of no physical relevancy since \aip\ 2 = \\a\4>\ 2 for all a € C, ip £ X (compare 



with (pU|). 

Moreover, the work in [5S| gives a representation of adjoints of operators of the 
form A3. 41), from which one can derive 



Corollary 1. For any given ip € Y , the Jacobian operator J{ip) \3.4\) is se 



adjoint with respect to the inner product (3.5) 



An important consequence of the independence of states of the complex argument 



(3.3) is the fact that solutions of (3.10) form a smooth manifold in X. This results in 
the linearization (3.4) in solutions always having a nontrivial kernel. Indeed, for any 

ipeX 

J {$){*!>) = (IC + V + 2g\^j\ 2 ) (»(,) - gaffi = i(JC + V + V> = i<S(V), (3.6) 

so for nontrivial solutions ip € X, ip ^ 0, F(tp) = 0, the dimension of the kernel of 
J(ip) is at least 1. 

Besides the fact that there is always a zero eigenvalue in a solution ip s and that 
all eigenvalues are real, not much more can be said about the spectrum; in general, 
J{ip) is indefinite. The definitencss depends entirely on the state ip; if ip is a solu- 



tion to (3.1), it is said to be stable or unstable depending whether or not J{ip) has 
negative eigenvalues. Typically, solutions with low energies tend to be stable whereas 
highly energetic solutions tend to be unstable. For physical systems in practice, it is 
uncommon to see more than ten negative eigenvalues for a given solution state. 

3.1. Principal problems for the numerical solution. While the numerical 
solution of nonlinear systems itself is challenging, the presence of a singularity in a 



solution as in (3.6) adds two major obstacles for using Newton's method. 

• Newton's method is guaranteed to converge towards a solution x Q-supcrlin- 
early in the area of attraction only if x is nondegenerate, i.e., the Jacobian 
in x is regular. If the Jacobian operator does have a singularity, only linear 
convergence can be guaranteed. 

• While no linear system has to be solved with the exactly singular J(ip), the 
Jacobian operator close the solution J(ip + Sip) will have at least one eigen- 
value of small magnitude, i.e., the Jacobian system becomes ill-conditioned 
when approaching a solution. 
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Several approaches have been suggested to deal with this situation, for a concise 
survey of the matter, see [13] . One of the most used strategies is bordering which 
suggests extending the original problem S(ip) — by a so-called phase condition to 
pin down the redundancy [T], 

= %,A):= W+^V (3.7) 



p{x) 

If y and p(-) are chosen according to some well-understood criteria [14] . the Jacobian 
systems can be shown to be well-conditioned throughout the Newton process. More- 
over, the bordering can be chosen in such a way that the linearization of the extended 
system is self-adjoint in extended scalar product if the linearization of the original 
problem was self-adjoint. This method has been applied to the specialization of the 



Ginzburg-Landau equations before [28 (3.101, and naturally generalizes to nonlinear 



Schrodinger equations in the same way. The major disadvantage of the bordering ap- 
proach, however, is that it is not clear how to precondition the extended system even 
if a good preconditioner for the original problem is known. This renders bordering 
unfeasible for the discretization of nonlinear PDEs with a large number of unknowns. 

In the particular case of nonlinear Schrodinger equations, the loss of speed of 
convergence is less severe than in more general settings. Note that there would be no 
slowdown at all if the Newton update Sip, given by 

jtyW = -s(V0, ( 3 - 8 ) 

was consistently orthogonal to the null space lip close to a solution ip. While this is 
not generally true, one is at least in the situation that the Newton update can never 
be an exact multiple of the direction of the approximate null space up. This is because 

J{ip){auP) = -S(*P), ael, 



together with (3.6), is equivalent to 

aiS(ip) = -S(ip) 

which can only be fulfilled if S(ip) — 0, i.e., if ip is already a solution. 

Consequently, loss of Q-supcrlinear convergence is hardly ever observed in nu- 
merical experiments. Figure |3.1[ for example, shows the Newton residual for the 
two- and three-dimensional test setups, both with the standard formulation and with 



the bordering 3.7 as proposed in [28]. Of course, the Newton iterates follow differ- 
ent trajectories, but the important thing to note is that in both plain and bordered 
formulation, the speed of convergence close the solution is comparable. 

The more severe restriction is in the numerical difficulty of solving the Jacobian 
systems in each Newton step due to the increasing ill-posedness of the problem as 
described above. However, although the Jacobian has a nontrivial near-null space 
close to a solution, the problem is well-defined at all times. This is because, by self- 
adjointness, its left near-null space coincides with the right near-null space, span{i^}, 



and the right-hand-side in (3.8), —S(ip), is orthogonal to vp for any ip: 



(i^s(V0> R = W>,£W0)r + (#> v(ip)) R + (up,g\ip\ 2 ip) M 

= R(i(iP,lCiP) 2 ) +R(i{iP,ViP) 2 ) + M(gi(\iP\ 2 ,\iP\ 2 ) ) =0. (3.9) 
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The numerical problem is hence caused only by the fact that one eigenvalue approaches 
the origin as the Newton iterates approach a solution. The authors propose to handle 
this difficulty on the level of the linear solves for the Newton updates using the 
deflation framework developed in section [2] 

3.2. The Ginzburg Landau equation. One important instance of nonlinear 



Schrodinger equations (3.1) is the Ginzburg-Landau equation that models supercur- 
rent density for extreme-type-II superconductors. Given an open, bounded domain 
Q C M d , d £ {2, 3}, the equations are 



"/C^-^l-M 2 ), 
n • (-iV - A)ij). 



(3.10) 



The operator K, is defined as 



JC : X -» Y, 
Jdp := (-iV 



A' 2 



ip. 



(3.11) 



with the magnetic vector potential A £ H^ d (rt) [5 ]. The operator K, describes the 
energy of a charged particle under the influence of a magnetic field B = V x A, and 
can be shown to be Hermitian and positive-semidefinite; the eigenvalue is assumed 
only for A = [33]. Solutions if) of (3.10) describe the density \ip\ 2 of electric charge 
carriers and fulfill < \tp\ 2 < 1 pointwise [IB]. For two-dimensional domains, they 
typically exhibit isolated zeros referred to as vortices; in three dimensions, lines of 



zeros are the typical solution pattern (see figure 3.2) 



Discretization. For the numerical experiments in this paper, a finite-volume-type 
discretization is employed [U[S5]. Let be a discretization of fl with a triangu- 
lation {T t }™ 1; (J™ i Ti = ^ (h \ and the node-centered Vor onoi tessellation {Vfe}^ =1 , 
Ufe=i Vk — ^ • Let further denote the edge between two nodes i, j. The 
discretized problem is then to find ipw e C" such that 

Vfc G {1, . . . , n} : = (S-W*/^) h := (K W^W) k - ^ (l - 1^ | 2 ) , (3.12) 
where the discrete kinetic energy operator K^ h ' is defined by 



£ ay [(4 h) - M h) ) ^ + - t ] ] (3-13) 

edges 7 - 



with the discrete inner product 



k=i 



and edge coefficients ct!jj S K [29]. The magnetic vector potential A is incorporated 
in the so-called link variables, 



Uij := cxp —l / eij ■ A(w) dw 
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along the edges e^j of the triangulation. 

Remark 2. In matrix form, the operator is represented as a product 
I ) l\ of the diagonal matrix D , Di^ = \Vi\, and a Hermitian matrix K . 

This discretization preserves a number of invariants of the problem, e.g., gauge 
invariance of the type ip ■— expjix}, A := A + V% with a given \ S C ,1 (^)- Moreover, 
the discretized energy operator K^ h > is Hermitian and positive-definite [29]. The 
discretized Jacobian operator is self-adjoint with respect to the discrete inner product 

(^\^) R .= ^(±\V k \^U[ h) ) (3.14) 



and the statements (3.6), (3.9) about the null space carry over from the continuous 
formulation. 

Remark 3 (Real- valued formulation) . There is a vector space isomorphism a : 
C™ — > R 2 ™ between K 2 ™ and C™ as vector space over K given by the basis mapping 

, (n)\ (2n) / (n)\ (2n) 

a(e) ')=e) a(ie) ') = e^ + j. 

In particular, note that the dimension of Cg is In. The isomorphism a is also iso- 
metric with the natural inner product (•, -) R o/C^, since for any given pair </>, ip € C" 
one /ias 



Moreover, linear operators over Cg generally have the form Lip = Alp + Bip with some 
A,B e C nx " and because of 

Lw = Xw ■<=> (aLa~ 1 )aw = Xaw, 

the spectral properties also exactly convey to its real-valued image cxLoT 1 . 

This equivalence can be relevant in practice as quite commonly, the original com- 
plex-valued problem in C™ is implemented in terms R 2n . Using the natural inner 
product in this space will yield the expected results without having to take particular 
care of the inner product. 

3.3. Numerical experiments. The numerical experiments are performed with 
the following two setups. 

Test setup 1 (2D). The circle S1 2 d : = {x e M 2 : ||ac|| 2 < 5} and the mag- 
netic vector potential A(x) := m x (x — x Q )/\\x — £U 1 1 3 with m := (0,0, 1) T and 
Xq := (0, 0, 5) T , corresponding to the magnetic field generated by a dipole at Xo with 
orientation m. A Delaunay triangulation for this domain was created using Trian- 
gle 130\/ . bearing 3299 points. With the discrete equivalent of ipo( x ) = cos(7rx) as 
initial guess, the Newton process converges after 27 iterations with a residual of less 



than 10 10 in the discretized norm (see figure 3.1). The final state is illustrated in 
figure [37 



Test setup 2 (3D) . The three-dimensional L-shape 
Q SD := {iel 3 : Halloo < 5}\ to3 



+ - 



discretized using Gmsh \11^ with 72166 points. The chosen magnetic vector field is 
constant Bsd(x) := 3 -1 / 2 (l, 1, 1) T , represented by the vector potential Ag£,{x) '■= 
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30 
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30 



Fig. 3.1: Newton residual history for the two-dimensional setup [l 

dimensional setup [2] (right), each with bordering ( | | and without ( 

guesses ipQ D (x) = cos(7ry) and ^^(x) = 1, respectively, the Newton process delivered 



(left) and three- 
With initial 



the solutions as highlighted in figure 3.2 in 22 and 27 steps, respectively. 




(a) Cooper-pair density 
H 2 - 




I 







(b) Cooper-pair density \tp\ 2 at the 
surface of the domain. 




(c) argtp. 




(d) Isosurface with \ip\ 2 — 0.1 (see 
(b)), axgt/j at the back sides of the 
cube. 



Fig. 3.2: Solutions of the test problems as found in the Newton process illustrated in 
figure RTT 
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nBsD x x - With the discrete equivalent of ipo{ x ) — 1> the Newton process converges 
after 22 iterations with a residual of less than 10~ 10 in the discretized norm (see 



figure 3.1). The final state is illustrated in figure 3.2 



For both of representative setups, a Newton run was performed where the linear 



systems (3.8) were solved using MINRES to exploit self-adjointness of J^ h \ Note that 



it is critical here to use the natural inner product of the system, (3.14). All of the 
numerical experiments incorporate the preconditioner proposed in 29J that is shown 
to bound the number of Krylov iterations needed to reach a certain relative residual 
by a constant independently of the number n of unknowns in the system. 

Remark 4. Neither of the above test problems have initial guesses which sit 
in the cone of attraction of the solution they eventually converge to. As typical for 
local nonlinear solvers, the iterations which do not directly correspond with the final 
convergence are sensitive to effects introduced by the discretization or round-off errors. 
It will hence be difficult to reproduce the shown solutions without exact information 
about the point coordinates in the discretization mesh. Nevertheless, the numerical 
results hold true for any converging Newton iteration, independently of their final 
state. 



Figure 3.3 shows the relative residuals for all Newton steps in both the two- and 
the three-dimensional setup. Note that the residual curves late in the Newton process 
(dark gray) exhibit plateaus of stagnation which are attributed to the low-magnitude 
eigenvalue associated with the near-null space vector iip( h \ 

Figure [3. 3b incorporates deflation of said vector via the framework as described in 



section [2j The incorporation of the preconditioner and the customized inner product 
|3.14| is crucial here. Clearly, the stagnation effects are remedied and a significantly 
lower number of iterations is necessary to reduce the residual norm to 10~ 10 . While 



this comes with extra computational cost per step (cf. table 2.1 ), this cost is negligible 
compared to the considerable convergence speedup. 



Remark 5. Note that the initial guess Xq is adapted according to (2.16) before 
the beginning the iteration. Because of that, the initial relative residual Aa?o||/||&— 
Axq\\ cannot generally be expected to equal 1 even if Xq = 0. In the particular case of 
U = lip, however, we have 



x 



V*x Q + U (U, JWWr 1 (U, S(iP)) n = V*x 



figure 3.3W. Note that this is not true anymore when more deflation vectors are added 



since (lip, S(ip) — (3.9), and the initial relative residual does equal 1 if xq = (cf. 



(cf. figure \3.3c\ ). 

As indicated in section[2j the idea of deflation can be taken further by noting that, 
towards the end of the Newton process, a series of very similar linear systems needs 
to be solved. This leads to the idea of extracting spectral information of the previous 
MINRES iteration use for deflation in the present process. For the experiments, 
those 12 Ritz vectors from the MINRES iteration in Newton step k which belong 
to the Ritz values of smallest magnitude were added for deflation in Newton step 
k+1. As displayed in figure [3~3c] the number of necessary Krylov iterations is further 
decreased roughly by a factor of 2. Note also that in particular, the characteristic 
plateaus corresponding with the low-magnitude eigenvalue to not anymore occur. This 
is particularly interesting since no information about the approximate null space was 
explicit specified, but automatically extracted from previous Newton steps. 

Technically, one could go ahead and extract even more Ritz vectors for deflation in 
the next step. However, at some point the extra cost associated with the extraction of 
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MINRES iteration MINRES iteration 

(c) Deflation of 12 Ritz vectors corresponding to the Ritz values of smallest magnitude. 

Fig. 3.3: MINRES convergence histories of all Newton steps for the 2D problem (left) 
and 3D problem (right). The color of the curve corresponds to the Newton step: light 
gray is the first Newton step while black is the last Newton step. 
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Fig. 3.4: Wall-times T p needed for MINRES solves for the test setups with deflation of 
those p Ritz vectors from the previous Newton step which correspond to the smallest 
Ritz values. As in the figure |3.3[ light gray lines correspond to steps early in the 
Newton process. All times are displayed relative to the computing time T without 
deflation. The dashed line at T p /Tq = 1 marks the threshold below which deflation 
pays off. 



the Ritz vectors (table [272] ) and the application of the projection operator (table 2.1 1 
will not justify a further increase of the deflation space. The efficiency threshold 
will be highly dependent on the cost of the preconditioner. Moreover, it is in most 
situations impossible to predict just how the deflation of a particular set of vectors 
influences the residual behavior in a Krylov process. For this reason, one has to reside 
to numerical experiments to estimate the optimal dimension of the deflation space. 
Figure |3.4| shows, again for all Newton steps in both setups, the wall time of the 



Krylov iterations as in figure 3.3 relative to the solution time without deflation. The 
experiments show that deflation in the first few Newton steps does not accelerate the 
computing speed. This is due to the fact that the Newton updates are still significantly 
large and the subsequent linear systems are too different from each other to take profit 
from carrying over spectral information. As the Newton process advances and the 
updates become smaller, the subsequent linear systems become similar and deflation 
of a number of vectors becomes profitable. Note, however, that there is a point at 
which the computational cost of extraction and application of the projection exceeds 
the gain in Krylov iterations. For the two-dimensional setup, this value is around 
12 while in the three-dimensional case, the minimum roughly stretches from 10 to 20 
deflated Ritz vectors. In both cases, a reduction of effective computation time by 40% 
could be achieved. 

Remark 6. Note that throughout the numerical experiments performed in this 
paper, the linear systems were solved in up to the relative residual o/10~ 10 . In prac- 
tice, however, one would employ a relaxation scheme as given in, e.g., [6, 251. Those 
schemes commonly advocate a relaxed relative tolerance T)k in regions of slow con- 
vergence, and a more stringent condition when the speed of convergence accelerates 
toward a solution, e.g., 



\\m 
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with some 7 > 0, a > 1. In the specific case of nonlinear Schrddinger equations, this 



means that deflation of the near-null vector lip (cf. figure 3.3b) becomes little effective 



if Vk is higher than where the stagnation plateau appears. The speedup associated 



with deflation with a number of Ritz vectors (cf. figure 3.3c), however, is effective 
throughout the Krylov iteration and would hence not be influenced by a premature 
abortion of the process. 

Remark 7. As opposed to the CG method, there is no consensus on what im- 
plementation of the MINRES procedure is least prone to round-off errors. In general, 
the MINRES iteration can be unstable to the point of complete stagnation well above 



machine precision [33] (see the stagnating process in figure 3.3a). This is attributed 
to the implicit loss of orthogonality of the Ritz vectors due to round- off errors and can 
thus remedied by storing all computed Ritz vectors and reorthogonalizing against the 
set in each MINRES iteration. This essentially leads to the GMRES process. 

All numerical experiments in this paper have been conducted with and without 
explicit reorthogonalization and despite the observation of a loss of orthogonality in 
the Ritz vectors, the convergence speedup with Ritz vectors from the previous step was 
not noticeably affected. 

Remark 8. The methods described and developed in this paper are implemented 
in the Python package pynosh JfJjl made available under a free and open source license 
(GNU General Public License 3). All experimental results presented in this section 
can be reproduced from the data published with the source code. 

4. Conclusions. For the solution of a sequence of self-adjoint linear systems 
such as occurring in Newton process for a large class of nonlinear problems, the au- 
thors propose a MINRES scheme that takes into account spectral information from 
the previous linear systems. Central to the approach is the cheap extraction of Ritz 



vectors (section 2.3 1 out of a MINRES iteration and the application of the projec- 
tion (2.101. As opposed to similar recycling methods previously suggested [36] . the 
present one consistently preserved self-adjointness of the projected operator, explicitly 
incorporates the use of a preconditioner, and allows for inner products other than the 
default pinner product. 

These properties are then crucial for the application to nonlinear Schrodinger 
equations (section |3|: The occurring linearization is self-adjoint with respect to a 



nonstandard inner product (3.14), and the computation in a three-dimensional set- 
ting is not feasible without a preconditioner. The authors could show that for the 
particular case of the Ginzburg-Landau equations, the deflation strategy reduces the 



effective run time of a linear solve by up to 40% (cf. figure 3.3c). Moreover, the defla- 
tion strategy was shown to automatically handle the singularity of the problem that 
otherwise leads to numerical instabilities. 

It is expected that the strategy will perform similarly for other nonlinear problems. 
While adding a number of vectors to the deflation will always lead to faster Krylov 
convergence, it only comes with extra computational cost in extracting the Ritz vector 
and applying the projection operator. The optimal number of deflated Ritz vectors is 
highly problem-dependent, in particular the computational cost of the preconditioner, 
and can thus hardly be determined a priori. 

The proposed strategy naturally extends to problems which are not self-adjoint 
by choosing, e.g., GMRES as the hosting Krylov method. For asymmetric problems, 
however, the effects of altered spectra on the Krylov convergence is far more involved 
than in the self-adjoint case [2D]. This also makes the choice of Ritz vectors for 
deflation difficult. 
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